Abstract

In physical domains (military or athletic), team behaviors
often have an observable spatio-temporal structure, defined
by the relative physical positions of team members over time.
In this paper, we demonstrate that this structure can be
exploited to recognize football plays in the Rush 2008
football simulator. Although events in the simulator are
stochastically generated, we present a method for reliably
recognizing football plays at a very early stage using
multiple support vector machines; moreover, we demonstrate
that having this early information about the defense’s intent
can be utilized to improve offensive team play. Our system
evaluates the competitive advantage of executing a play
switch based on the potential of other plays to increase the
yardage gained and the similarity of the candidate plays to
the current play. Our play switch selection mechanism
outperforms both the built-in offense and a greedy
yardage-based switching strategy.


Figure  1:  Screenshot  of  the  Rush  2008  football  simulator. 
The  offense  team  (shown  in  red)  is  using  the  play  split  8  and 
being  countered  by  the  defense  (shown  in  blue)  using  a  31 
formation  (variant  1). 


1  Introduction 

This paper addresses the problem of early recognition of
opponent intent in adversarial team games. In physical domains
(military or athletic), team behaviors often have an observable
spatio-temporal structure, defined by the relative physical
positions of team members. This structure can be exploited to
perform behavior recognition on traces of agent activity over
time. There are three general types of cues that can be used
to perform recognition:

•  spatial  relationships  between  team  members  and/or 
physical  landmarks  that  remain  fixed  over  a  period  of 
time; 

•  temporal  dependencies  between  behaviors  in  a  plan  or 
between  actions  in  a  team  behavior; 

•  coordination  constraints  between  agents  and  the  actions 
that  they  are  performing. 

This paper describes a method for recognizing defensive
plays from spatio-temporal traces of player movement in the
Rush 2008 football game (see Figure 1) and using this
information to improve offensive play. To succeed at American
football, a team must be able to successfully execute closely
coordinated physical behavior. To achieve this tight physical
coordination, teams rely upon a pre-existing playbook of
offensive maneuvers [Association, 2000b] to move the ball
down the field and defensive strategies [Association, 2000a]
to counter the opposing team’s attempts to make yardage
gains. Rush 2008 simulates a modified version of American
football with 8 players per team; plays in Rush are composed
of a starting formation and instructions for each player
in the formation. These instructions are similar to a
conditional plan and include choice points where the players
can make individual decisions as well as pre-defined behaviors
that the player executes to the best of its physical capability.

Although there have been other studies examining the
problem of recognizing completed football plays, we present
results on recognizing football plays online at an early stage
of play, and demonstrate a mechanism for exploiting this
knowledge to improve a team’s offense. Our system evaluates
the competitive advantage of executing a play switch
based on the potential of other plays to improve the yardage
gained and the similarity of the candidate plays to the
current play. Our play switch selection mechanism outperforms
both the built-in offense and a greedy yardage-based
switching strategy. Calculating the relative similarity of the
current play compared to the proposed play is shown to be a
necessary step to reduce confusion on the field and effectively
boost performance.

The remainder of the paper is organized as follows.
Section 2 summarizes related work on team behavior
recognition in games and simulation environments. Section 3
describes the Rush 2008 football simulator, which was
developed from the open source Rush 2005 football game
[Rush, 2005]. The potential for increasing yardage gains
through intelligent play choice is discussed in Section 4.
Section 5 presents our support-vector-based classification
approach for early play recognition and results for defensive
play recognition. In Section 6 we present our play switching
mechanism and introduce the play similarity metric used for
calculating switches. We present results for our offensive play
improvement procedure (described in Section 7) in Section 8,
before concluding the paper with some discussion on the
potential uses of early intent recognition.

2  Related  Work 

Previous work on team behavior recognition has been
primarily evaluated within athletic domains, including
American football [Intille and Bobick, 1999], basketball
[Bhandari et al., 1997; Jug et al., 2003], and Robocup soccer
simulations [Riley and Veloso, 2000; 2002; Kuhlmann et al.,
2006], with some work in non-athletic domains [Huber and
Hadley, 1997; Vail et al., 2007]. To recognize athletic
behaviors, researchers have exploited simple region-based
[Intille and Bobick, 1999] or distance-based [Riley and
Veloso, 2002] heuristics to build accurate, but domain-specific
classifiers. For instance, based on the premise that all
behaviors always occur on the same playing field with a known
number of entities, it is often possible to divide the playing
field into grids or typed regions (e.g., goal, scrimmage line)
that can be used to classify player actions. In contrast, we
train our classifiers on raw observation traces and do not
rely on a field-based marker system.

Most of the camera-based sports analysis work has focused
on extracting observation traces, addressing problems such
as field rectification and player tracking [Intille and Bobick,
1994], and has spent relatively little effort on play
recognition and opponent modeling. In Intille and Bobick’s
original system, football play recognition [1999] is performed
on player trajectories using belief networks both to recognize
agent actions from visual evidence (e.g., catching a pass) and
to determine the temporal relations between actions (e.g.,
before, after, around). Jug et al. [2003] used a similar
framework for offline basketball game analysis. More recently,
Hess et al. demonstrated the use of a pictorial structure model
to classify football formations from snapshots [Hess et al.,
2007]. These systems were used for post-game analysis of
formations and behaviors only and did not address the problem
of online intention recognition.

In Robocup, there has been some research on team intent
recognition geared towards the Robocup coach competition.
Techniques have been developed to extract specific
information, such as home areas [Riley et al., 2002],
opponent positions during set-plays [Riley and Veloso, 2002],
and adversarial models [Riley and Veloso, 2000], from logs
of Robocup simulation league games. This information can
be utilized by the coach agent to improve the team’s scoring
performance. For instance, information about opponent
agent home areas can be used as triggers for coaching advice
and for doing “formation-based marking”, in which different
team members are assigned to track members of the opposing
team. However, the focus of the coaching agents is to
improve performance of teams in future games; our system
immediately takes action on the recognized play to evaluate
possible play switches.

3  Rush  Football 

Football is a contest of two teams played on a rectangular
field that is bordered on lengthwise sides by an end zone.
Unlike American football, Rush teams only have 8 players on the
field at a time out of a roster of 18 players, and the field is
100 yards by 63 yards. The game’s objective is to out-score
the opponent, where the offense (i.e., the team with
possession of the ball) attempts to advance the ball from the
line of scrimmage into their opponent’s end zone. In a full
game, the offensive team has four attempts to get a first down
by moving the ball 10 yards down the field. If the ball is
intercepted or fumbled, ball possession transfers to the
defensive team.


[Figure 2 panels: Power vs 31; Split vs 2231]

Figure 2: Two offensive and defensive configurations.
Offensive players are shown on the bottom half and the defense
the top half.

A Rush play is composed of (1) a starting formation and (2)
instructions for each player in that formation. A formation is
a set of (x,y) offsets from the center of the line of scrimmage.
By default, directions for each player consist of (a) an
offset/destination point on the field to run to, and (b) a
behavior to execute when they get there. Play instructions are
similar to a conditional plan and include choice points where
the players can make individual decisions as well as pre-defined
behaviors that the player executes to the best of their physical
capability. Rush includes three offensive formations (power,
pro, and split) and four defensive ones (23, 31, 2222, 2231).
Each formation has eight different plays (numbered 1-8) that
can be executed from that formation.

Table 1: Offensive Plays from the Power Formation

Play Variant    Description
1               handoff to RB
2               handoff to RB
3               handoff to RB
4               handoff to FB
5               pass towards the left
6               pass using hook routes
7               pass to FB
8               general pass play

Offensive plays typically include a handoff to the running
back/fullback or a pass executed by the quarterback to one of
the receivers, along with instructions for a running pattern to
be followed by all the receivers. An example play from the
split formation is given below:

•  the  quarterback  will  pass  to  an  open  receiver; 

•  the  running  back  and  fullback  will  run  hook  routes; 

•  the  left  wide  receiver  will  run  a  corner  right  route; 

•  the  right  wide  receiver  will  run  a  hook  route; 

•  the  other  players  will  block  for  the  ball  holder. 

Figure 1 shows an example execution of the above passing
play being countered by the defense using a 31 formation
(variant 1). The quarterback has already thrown the
ball, which is currently in the air between the 40 and 50 yard
lines. In Rush defensive plays, the players are given the role
of guarding zones of the field or pursuing specific offensive
players. In Figure 1, the defense is countering with this
allocation of players to tasks:

•  the  defensive  linemen  are  chasing  the  quarterback; 

•  the  linebacker  is  pursuing  the  running  back; 

•  the  cornerbacks  are  following  their  respective  wide  re¬ 
ceivers; 

•  Safety  1  is  guarding  the  high  zone  (top  portion  of  the 
field); 

•  Safety 2 is guarding the middle zone (middle portion of
the field).

Table 1 gives general descriptions of possible plays that
can be executed from the power starting positions using the
lineup: quarterback (QB), running back (RB), fullback (FB),
wide receiver 1 (WR1), tight end 1 (TE1), offensive lineman 1
(OL1), offensive lineman 2 (OL2), offensive lineman 3 (OL3).
Rush teams have a roster of offensive and defensive
players, each possessing unique physical capabilities, which
are specified in a game configuration file using a ten point
scale to designate the player’s power, speed, skill, and
endurance. The team compositions are loosely modeled after
players on various NFL teams. The experiments described in
this paper were run with the Atlanta Falcons (offense) vs. the
New England Patriots (defense).

A player’s physical capabilities affect his running speed,
ability to handle the ball, and ability to block and tackle other
players. In a mechanical sense, Rush treats both players and
the ball as 2-dimensional rectangular objects capable of
infinite acceleration. As soon as a player or the ball starts to
move, it takes on a constant velocity, with the exception that
the ball will accelerate downwards due to gravity. When
objects overlap, a collision occurs. A collision between players
may result in a tackle if a player is carrying the ball or
performing a block for the ball carrier.

4  Competitive Advantage of Intention Recognition

This section discusses how knowledge of the opponents’
intended play affects the yardage gained by teams in Rush
2008. Although the players’ physical capabilities affect the
outcome of individual events, such as blocks, handoffs, and
distance covered, the play choice is of key importance in
determining the total yardage gained on a play. Each play
specifies a particular allocation of players to tasks and
positions; different allocations leave various openings in the
field which could be exploited by the opponent. Unlike in real
football, it is not possible for players to conceal the location
of the ball in the Rush simulator to divert attention away from
the ball carrier. However, we show that recognizing the
opponents’ intention can still confer significant competitive
advantage by examining the expected outcome of teams executing
different play combinations. For the purposes of this paper, we
focus on yardage gained over a single play, independent of
the team’s down or other plays that the team has executed in
the past.

To study the effectiveness of different plays, we ran each
play combination 50 times (a total of 38,400 games) to
determine the expected yardage for each play combination in
the teams’ playbooks and examined the impact of play
selection on yardage gained. Table 2 clearly shows that there
is a large difference between the best response case for the
offense (the offense playing their best play vs. the worst
defense), the worst scenario (the offense playing their weakest
play vs. the best defense), and the average yardage gained for
all play combinations using that pair of formations. Although
we do not have any direct control over the defense’s choice of
formation and play variant, we could in theory increase
offensive yards gained by playing our best response play.
However, without prior information about the defense’s choice
of play, the best way to gain this competitive advantage is
through accurate early play recognition, gaining knowledge of
the other team’s intent sufficiently early in the play to
exploit holes in the current defense.
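
The game count and the idea of a best-response play can be sketched in a few lines. This is only an illustrative sketch, not the authors' code: the function names and the `mean_yardage` table (a mapping from an (offensive play, defensive play) pair to mean yards) are assumptions introduced here, and any values passed in are placeholders rather than measured results.

```python
from itertools import product

OFFENSIVE_FORMATIONS = ["power", "pro", "split"]
DEFENSIVE_FORMATIONS = ["23", "31", "2222", "2231"]
PLAYS_PER_FORMATION = 8
RUNS_PER_COMBINATION = 50


def count_simulated_games():
    """Count the runs needed to sample every play combination 50 times."""
    plays = range(1, PLAYS_PER_FORMATION + 1)
    combos = list(product(OFFENSIVE_FORMATIONS, plays,
                          DEFENSIVE_FORMATIONS, plays))
    return len(combos) * RUNS_PER_COMBINATION


def best_response(mean_yardage, defensive_play):
    """Return the offensive play with the highest mean yardage
    against a known defensive play (hypothetical table layout)."""
    against = {off: yards for (off, dfn), yards in mean_yardage.items()
               if dfn == defensive_play}
    return max(against, key=against.get)
```

With 3 x 8 offensive plays and 4 x 8 defensive plays there are 768 combinations, which at 50 runs each matches the 38,400 games reported above.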

5  Play  Recognition  using  SVM 

In this paper we focus on intent recognition from the
viewpoint of the offense: given a series of observations, our
goal is to recognize the defensive play as quickly as possible
in order to maximize our team’s ability to intelligently respond
with the best offense. Thus, the observation sequence grows with
time, unlike in standard offline activity recognition where the
entire set of observations is available. We approach the
problem by training a series of multi-class discriminative
classifiers, each of which is designed to handle observation
sequences of a particular length. In general, we expect that the
early classifiers should be less accurate since they are
operating with a shorter observation vector and because the
positions of the players have deviated little from the initial
formation.

Table 2: Yardages for Best and Worst Offensive Choices

Offense   Defense   Best (yds)   Worst (yds)   Avg (yds)
power     23        10.84        2.41          5.82
power     31        12.82        2.04          5.83
power     2222      7.82         3.67          5.19
power     2231      9.56         3.82          6.93
pro       23        11.34        2.52          6.65
pro       31        17.3         4.77          9.32
pro       2222      23.71        5.47          9.99
pro       2231      16.95        6.78          11.35
split     23        14.74        6.65          10.26
split     31        14.96        3.53          10.71
split     2222      51.82        6.01          17.13
split     2231      51.81        7.43          17.91

There are 12 initial configurations for the players (choice
of 3 formations for the offense and 4 for the defense). We
formulate the problem as follows. Let $A = \{a_1, a_2, \ldots, a_{16}\}$
be the set of agents in the scenario, where the index of each
agent maps to its role, as given by the starting configuration.
We observe the 2D position of each agent at every time step,
enabling us to construct an observation vector at time $t$ that
is a concatenation of the observed states of every agent. Thus,
the dimensionality of the observation vector is $32t$. Since the
offense can select from a set of 8 plays from each initial
configuration, the goal of the intent recognition is to output a
label in $\{1, \ldots, 8\}$, and the baseline accuracy for this
task is 12.5%.
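
As a rough illustration of how such an observation vector might be assembled, the sketch below concatenates the (x, y) position of every agent over the observed time steps. It assumes 16 agents (8 per team), which is consistent with the 64- to 320-dimensional vectors reported in Section 5.1; the function names and the frame layout are hypothetical, not taken from the Rush code base.

```python
NUM_AGENTS = 16  # assumption: 8 offensive plus 8 defensive players


def build_observation_vector(frames):
    """Concatenate the 2D positions of all agents for every time step.

    `frames` is a list with one entry per time step; each entry is a
    list of (x, y) tuples ordered by agent role.
    """
    vector = []
    for frame in frames:
        if len(frame) != NUM_AGENTS:
            raise ValueError("each frame must hold one (x, y) pair per agent")
        for x, y in frame:
            vector.extend((x, y))
    return vector


def chance_accuracy(num_plays=8):
    """Accuracy of guessing uniformly among the candidate plays."""
    return 1.0 / num_plays
```

Two observed time steps give a vector of length 64 and ten give 320, while uniform guessing among eight plays gives the 12.5% baseline.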

We perform this classification using support vector
machines [Vapnik, 1998]. A support vector machine (SVM) is
a supervised binary classification algorithm that has been
demonstrated to perform well on a variety of pattern
classification tasks, particularly when the dimensionality of
the data is high (as in our case). Intuitively, the support
vector machine projects data points into a higher dimensional
space, specified by a kernel function, and computes a
maximum-margin hyperplane decision surface that separates the
two classes. Support vectors are those data points that lie
closest to this decision surface; if these data points were
removed from the training data, the decision surface would
change. More formally, given a labeled training set
$\{(x_1, y_1), (x_2, y_2), \ldots, (x_l, y_l)\}$,
where $x_i \in \mathbb{R}^n$ is a feature vector and
$y_i \in \{-1, +1\}$ is its binary class label, an SVM requires
solving the following optimization problem:

$$\min_{w, b, \xi} \; \frac{1}{2} w^T w + C \sum_{i=1}^{l} \xi_i$$

constrained by:

$$y_i \left( w^T \phi(x_i) + b \right) \geq 1 - \xi_i, \qquad \xi_i \geq 0.$$

The function $\phi(\cdot)$ that maps data points into the higher
dimensional space is not explicitly represented; rather, a
kernel function, $K(x_i, x_j) = \phi(x_i)^T \phi(x_j)$, is used to
implicitly specify this mapping. In our application, we use the
popular radial basis function (RBF) kernel:

$$K(x_i, x_j) = \exp\left(-\gamma \|x_i - x_j\|^2\right), \quad \gamma > 0.$$
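
For concreteness, the RBF kernel can be evaluated directly from its definition. The following is a minimal standard-library sketch; the function name `rbf_kernel` is chosen here for illustration and is not part of LIBSVM or of the authors' system.

```python
import math


def rbf_kernel(x_i, x_j, gamma):
    """Evaluate exp(-gamma * ||x_i - x_j||^2) for two equal-length vectors."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    if len(x_i) != len(x_j):
        raise ValueError("vectors must have the same length")
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma * squared_distance)
```

Identical vectors give a kernel value of 1, and the value decays towards 0 as the vectors move apart, at a rate controlled by gamma.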

Several extensions have been proposed to enable SVMs to
operate on multi-class problems (with $k$ rather than 2 classes),
such as one-vs-all, one-vs-one, and error-correcting output
codes. We employ a standard one-vs-one voting scheme
where all pairwise binary classifiers, $k(k-1)/2 = 28$ for
every multi-class problem in our case, are trained and the
most popular class is selected. When multiple classes
receive the highest vote, we select the winning one with the
lowest index; the benefit of this approach is that
classification is deterministic, but it can bias our
classification in favor of lower-numbered plays. For a real
game system, we would employ a randomized tie-breaking strategy.
Many efficient implementations of SVMs are publicly available;
we use LIBSVM [Chang and Lin, 2001].
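
The one-vs-one voting rule with lowest-index tie-breaking can be sketched as follows. This is an illustration of the voting logic only, not of LIBSVM's internals; `binary_predict` is a hypothetical callable standing in for a trained pairwise classifier that returns one of its two class labels.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_predict(classes, binary_predict, x):
    """Vote over all pairwise classifiers; ties go to the lowest label.

    `binary_predict(a, b, x)` is assumed to return either `a` or `b`.
    """
    votes = Counter()
    for a, b in combinations(sorted(classes), 2):
        votes[binary_predict(a, b, x)] += 1
    top = max(votes.values())
    # Deterministic tie-break: the lowest-numbered play wins.
    return min(label for label, count in votes.items() if count == top)
```

For eight plays this loops over the 28 pairwise classifiers mentioned above. Replacing the final `min` with a random choice among the tied labels would give the randomized tie-breaking suggested for a real game system.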

We train our classifiers using a collection of simulated
games in Rush collected under controlled conditions: 40
instances of every possible combination of offense (8) and
defense plays (8), from each of the 12 starting formation
configurations. Since the starting configuration is known, each
series of SVMs is only trained with data that could be
observed starting from its given configuration. For each
configuration, we create a series of training sequences that
accumulates spatio-temporal traces from $t = 0$ up to
$t \in \{2, \ldots, 10\}$ time steps. A multiclass SVM (i.e., a
collection of 28 binary SVMs) is trained for each of these cases.
Although the aggregate number of binary classifiers is large,
each classifier only employs a small fraction of the dataset and
is therefore efficient (and highly parallelizable).
Cross-validation was used to tune the SVM parameters ($C$ and
the kernel parameter $\gamma$) for all of the SVMs.

Classification  at  testing  time  is  very  fast  and  proceeds  as 
follows.  We  select  the  multiclass  SVM  that  is  relevant  to  the 
current  starting  configuration  and  time  step.  An  observation 
vector  of  the  correct  length  is  generated  (this  can  be  done 
incrementally  during  game  play)  and  fed  to  the  multi-class 
SVM.  The  output  of  the  intent  recognizer  is  the  system’s  best 
guess  (at  the  current  time  step)  about  the  opponent’s  choice  of 
defensive  play  and  can  help  us  to  select  the  most  appropriate 
offense,  as  discussed  below. 

5.1  Evaluation  of  Play  Recognition 

For the experiments reported in this paper, we collected a
testing set of Rush games with 10 instances of every combination
of offensive and defensive plays from each of the different
starting configurations. We created observation vectors from
this test set for each of the nine selected timesteps, resulting
in a data set with 108 configurations, each with 640 instances
(10 instances of each offense play vs. a defense play). The
dimensionality of the problem ranged from 64 (for the
shortest observation vector) to 320 (for the longest).
Deterministic tie breaking was used to return a forced choice
for plays with equal numbers of votes (lowest play number wins).

Table 3 summarizes the experimental results for different
lengths of the observation vector (time from start of play),
averaging classification accuracy across all starting
formation choices and defense choices. We see that at the
earliest timestep, our classification accuracy is at the
baseline but jumps sharply to near-perfect levels at t = 3.
This strongly confirms the feasibility of accurate intent
recognition in Rush, even during very early stages of a play.
At t = 2, there is insufficient information to discriminate
between defensive plays (perceptual aliasing); however, by
t = 3, the positions of the defensive team are distinctive
enough to be reliably recognized. The only case where there is
insufficient information to discriminate between play variants
is when the defense is using formation 23. Play variant 1 and
play variant 2 in this formation are extremely similar,
differing only in the deployment of 2 players; hence, play
variant 1 is consistently misclassified as being play variant 2
even at t = 10.

6  Offensive  Play  Switches 

To improve offensive performance, our system evaluates the
competitive advantage of executing a play switch based on
1) the potential of other plays to improve the yardage gained
and 2) the similarity of the candidate plays to the current play.
First, we train a set of SVM models to recognize defensive
plays at a particular time horizon as described in the previous
section; this training data is then used to identify promising
play switches. A play switch is executed:

1.  after the defensive play has been identified by the SVM
classifier; 

2.  if  there  is  a  stronger  alternate  play  based  on  the  yardage 
history  of  that  play  vs.  the  defense; 

3.  if  the  candidate  play  is  sufficiently  similar  to  the  current 
play  to  be  feasible  for  immediate  execution. 

Rather than calculating play similarity based on executions
of individual traces, for every play combination, we create a
probability distribution model (shown in Figure 3) to describe
the players’ positions over time, based on the training data.
We use this probabilistic representation of the team’s
spatio-temporal traces to determine the similarity between the
plays, using a feature set described in the next section. To
determine whether to execute the play switch for a particular
combination of plays, the agent considers $N$, the set of all
offensive plays shown to gain more than a threshold value
$\epsilon$. The agent then selects $\min(n \in N)$, the play in
the list most like the current play for each play configuration,
and caches the preferred play in a lookup table.
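
The selection rule can be sketched as below, alongside a greedy yardage-only rule for contrast. This is a minimal sketch under stated assumptions rather than the authors' implementation: the function names, the dictionary-based inputs, and the reading that a candidate must out-gain the current play as well as exceed the threshold are all assumptions made for illustration.

```python
def select_switch(current_play, yardage, distance_from_current, epsilon):
    """Pick the candidate most similar to the current play.

    `yardage` maps each offensive play to its expected yards against the
    recognized defensive play; `distance_from_current` maps each play to
    its dissimilarity from the current play (smaller means more similar).
    """
    current_yards = yardage[current_play]
    candidates = [
        play for play, yards in yardage.items()
        if play != current_play and yards > epsilon and yards > current_yards
    ]
    if not candidates:
        return current_play  # no worthwhile switch: keep the current play
    return min(candidates, key=lambda play: (distance_from_current[play], play))


def greedy_switch(yardage):
    """Yardage-only alternative: always pick the highest-yardage play."""
    return max(yardage, key=lambda play: (yardage[play], -play))
```

Because the recognized defensive play is one of a small fixed set, the result of `select_switch` can be computed once per play configuration and stored in a lookup table, as described above.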

When a play is executed, the agent will use all observations
up to and including observation 3 to determine what play the
defense is executing before performing a lookup to determine
the play switch to make. The process is ended with execution
of a change order to all members of the offensive team.
Calculating the feasibility of the play switch based on play
similarity is a crucial part of improving the team’s performance;
in the results section, we evaluate our similarity-based play
switch mechanism vs. a greedy play switching algorithm that
focuses solely on the potential for yardage gained.

6.1  Play  Similarity  Metric 

To calculate play similarities, we create a feature matrix for
all offensive formation/play combinations based on the
training data. To store the spatio-temporal traces used to
calculate play similarity, we create a probability distribution
model of the movements $l$ of all offensive players
$A = \{a_1, \ldots, a_8\}$ for all time steps $t$, based on our
training samples ($S$). Let $\sum_S A_{lt}$ indicate the number
of times player $A$ visited location $l$ at time $t$. The
probability that athlete $A$ will visit location $l$ at time $t$
is calculated using the formula:

$$P(A_{lt}) = \frac{\sum_S A_{lt}}{50}$$

Figure 3: Single execution trace from Power-3 vs 2222-3 (top
left) and Power-5 vs 2222-1 (top right). The ball position is
marked in red. Probabilistic trace models (bottom) for the
same plays; increased dot sizes indicate higher probabilities
of the player (or ball) being at a particular location at a given
time.

For each player and every location the player visits, we
store the probability that the player will be at a specific
location at a given time, and the player’s four initial
movements are used to create a feature vector.
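
The visit-count model for a single player can be sketched as follows. The data layout (one list of (x, y) locations per simulated run, indexed by time step) and the function name are assumptions for illustration; the sketch divides by the number of runs supplied, which corresponds to the 50 training samples per play combination used above.

```python
from collections import Counter


def position_probabilities(runs):
    """Estimate P(player is at location l at time t) from repeated runs.

    `runs` is a list of traces for one player; each trace is a list of
    (x, y) locations, where the list index is the time step.
    """
    counts = Counter()
    for trace in runs:
        for t, location in enumerate(trace):
            counts[(t, location)] += 1
    num_runs = len(runs)
    return {key: count / num_runs for key, count in counts.items()}
```

Locations that are visited in most runs at a given time step receive probabilities close to 1, which is what the larger dots in the probabilistic trace models of Figure 3 depict.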

Table 3: Play recognition results

Offense  Defense  t = 2    3        4        5        6        7        8        9        10
Power    23       12.50%   87.50%   87.50%   87.20%   87.28%   87.24%   87.24%   86.94%   86.83%
Pro      23       12.50%   87.50%   87.50%   87.57%   87.24%   87.65%   87.61%   87.83%   87.54%
Split    23       12.50%   87.50%   87.50%   87.39%   87.46%   87.54%   87.87%   87.24%   87.43%
Power    31       12.50%   100.00%  100.00%  100.00%  100.00%  100.00%  100.00%  100.00%  100.00%
Pro      31       12.50%   100.00%  99.96%   100.00%  100.00%  100.00%  100.00%  100.00%  100.00%
Split    31       12.50%   100.00%  100.00%  100.00%  100.00%  100.00%  99.96%   99.96%   99.96%
Power    2231     12.50%   100.00%  100.00%  100.00%  100.00%  100.00%  100.00%  100.00%  100.00%
Pro      2231     12.50%   100.00%  100.00%  100.00%  100.00%  100.00%  100.00%  100.00%  100.00%
Split    2231     12.51%   100.00%  100.00%  100.00%  100.00%  100.00%  100.00%  99.96%   99.93%
Power    2222     12.47%   100.00%  100.00%  100.00%  100.00%  100.00%  100.00%  100.00%  100.00%
Pro      2222     12.50%   100.00%  100.00%  100.00%  100.00%  100.00%  100.00%  100.00%  100.00%
Split    2222     12.50%   100.00%  100.00%  100.00%  100.00%  100.00%  100.00%  100.00%  100.00%

The features collected for each athlete A are:

Max(X): The rightmost position traveled to by A
Max(Y): The highest position traveled to by A
Min(X): The leftmost position traveled to by A
Min(Y): The lowest position traveled to by A

Mean(X): $= \frac{\sum_i x_i}{n}$

Mean(Y): $= \frac{\sum_i y_i}{n}$

Median(X): $= \mathrm{Sort}(X)_{n/2}$

Median(Y): $= \mathrm{Sort}(Y)_{n/2}$

FirstToLastAngle: Angle from starting point $(x_1, y_1)$ to
ending point $(x_2, y_2)$, defined as
$\mathrm{atan}\left(\frac{y_2 - y_1}{x_2 - x_1}\right)$

Start_Angle: Angle from the starting point $(x_0, y_0)$ to
$(x_1, y_1)$, defined as
$\mathrm{atan}\left(\frac{y_1 - y_0}{x_1 - x_0}\right)$

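End_Angle: Angle from the starting point $(x_{n-1}, y_{n-1})$ to
$(x_n, y_n)$, defined as
$\mathrm{atan}\left(\frac{y_n - y_{n-1}}{x_n - x_{n-1}}\right)$

Total_Angle: $= \sum_{i=0}^{n-1} \mathrm{atan}\left(\frac{y_{i+1} - y_i}{x_{i+1} - x_i}\right)$

Total_Path_Distance: $= \sum_{i=0}^{n} \sqrt{x_i^2 + y_i^2}$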

Feature set $F$ for a given play $c$ contains all the features
for each offensive player in the play and is described as

$$F_c = \{A_{c1} \cup A_{c2} \cup \cdots \cup A_{c8}\}$$

These features are similar to the ones used in [Rubine,
1991] and, more recently, by [Wobbrock et al., 2007] to match
pen trajectories in sketch-based recognition tasks, but
generalized to handle multi-player trajectories. To compare
plays we use the sum of the absolute value of the differences
($L_1$ norm) between each feature $F_c$. This information is
used to build a similarity matrix $M_{ij}$ for each possible
offensive play combination as defined below.

$$M_{ij} = \sum_{c=0}^{|F_c| - 1} |\Delta F_c|, \qquad i, j = 1, \ldots, 8$$

where $\Delta F_c$ is the difference in feature $c$ between
plays $i$ and $j$.

There is one matrix $M$ for each offensive formation
$O_\beta$, where $\beta = \{\text{pro}, \text{power}, \text{split}\}$
are the offensive formations. Defensive formation/play
combinations are indicated by $D_{\alpha p}$, where
$\alpha = \{23, 31, 2222, 2231\}$ and $p$ represents plays 1..8.
$M$ for a specific play configuration is expressed as
$O_\beta D_{\alpha p} M_i$, given $i$ (1...8) is our current
offensive play. The purpose of this algorithm is to find a
value $j$ (play) most similar to $i$ (our current play), with a
history (based on earlier observation) of scoring the most
yardage. This process is accomplished for every offensive
play formation against every defensive play formation and
play combination. When the agent is constructing the lookup
table and needs to determine the most similar play from a list,
given current play $i$, it calls the method
$\min(O_\beta D_{\alpha p} M_i)$, which returns the most similar
play.


7  Improving  the  Offense 

Our algorithm for improving Rush offensive play has two
main phases: a preprocessing stage that yields a play switch
lookup table, and an execution stage in which the defensive play
is recognized and the offense responds with an appropriate
play switch for that defensive play. As described in Section 5,
we train a set of SVM classifiers using 40 instances of every
possible combination of offense (8) and defense plays (8),
from each of the 12 starting formation configurations. This
stage yields a set of models used for play recognition during
the game. Next, we calculate and cache play switches using
the following procedure:

Step  1:  Collect  data  by  running  the  RUSH  2008  football 
simulator  50  times  for  every  play  combination. 

Step 2: Create yardage lookup tables for each play combination.
This information alone is insufficient to determine
how good a candidate play is as the target of a play switch.
The new play must resemble our current offensive play,
or the offensive team will spend too much time retracing
steps and perform very poorly.

Step  3:  Create  a  probabilistic  trace  representation  for  all  50 
training  plays  with  field  locations  and  probabilities  of 
the  players  being  observed  in  these  locations. 

Step 4: Create a feature matrix for all offensive formation/play
combinations using the probabilistic trace representation.

Step  5:  Create  the  final  play  switch  lookup  table  based  on 
both  the  yardage  information  and  the  play  similarity. 

To create the play switch lookup table, the agent first
extracts a list of offensive plays $L$ satisfying the requirement
$yards(L_i) > \epsilon$, where $\epsilon$ is the smallest yardage gain at
which the agent does not consider changing the current
offensive play to another. We used $\epsilon = 1.95$, based on a
quadratic polynomial fit of total yardage gained in 6 tests with
$\epsilon = \{MIN, 1.1, 1.6, 2.1, 2.6, MAX\}$, where MIN is small
enough that no plays are selected for change and MAX is a setting
at which all plays are selected for change to the highest-yardage
play with no similarity comparison. Second, from the list $L$, the
agent finds the play most similar (smallest value in the matrix) to
our current play $i$ using $\min(O_\beta D_{\alpha p} M_i)$ and adds it to
the lookup file.
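
The two-stage selection described above (filter by yardage, then pick the most similar survivor) might look like the following Python sketch. The function names, the defensive-play label "31-1", and all numbers are hypothetical. One assumption is labeled in the code: when no play exceeds the threshold, the sketch keeps the current play, which the text does not specify.

```python
def best_switch(current, yards, similarity_row, epsilon=1.95):
    """Pick the play to switch to from the current play.

    yards[j] is the average yardage of offensive play j against one
    defensive play; similarity_row[j] is the L1 distance between the
    current play and play j (smaller means more similar).
    """
    candidates = [j for j, y in enumerate(yards) if y > epsilon]
    if not candidates:
        # Assumption: with no qualifying play, stay on the current play.
        return current
    # If the current play itself qualifies, its distance of zero wins,
    # so no switch is made.
    return min(candidates, key=lambda j: similarity_row[j])

def build_lookup(yards_by_defense, similarity, epsilon=1.95):
    """Map (defensive play, current offensive play) to the chosen play."""
    table = {}
    for defense, yards in yards_by_defense.items():
        for current in range(len(yards)):
            table[(defense, current)] = best_switch(
                current, yards, similarity[current], epsilon)
    return table

# Hypothetical data for three offensive plays and one defensive play.
similarity = [[0.0, 3.0, 0.5],
              [3.0, 0.0, 2.5],
              [0.5, 2.5, 0.0]]
yards_by_defense = {"31-1": [1.2, 4.0, 2.5]}
print(build_lookup(yards_by_defense, similarity))
```

With these toy numbers, play 0 falls below the threshold and is switched to play 2, the more similar of the two qualifying plays, rather than to the higher-yardage play 1 that a purely greedy choice would select. Plays 1 and 2 already exceed the threshold and are left unchanged.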


During execution, the offense uses the following procedure:

1. For observations t < 4, collect movement traces for
each play.

2.  At  t  =  3,  use  LIBSVM  with  collected  movement  traces 
and  the  trained  SVM  models  to  identify  the  defensive 
play. 

3.  Use  the  lookup  file  to  find  best(i)  for  the  current  play  i. 

4.  Send  a  change  order  command  to  the  offensive  team  to 
change  to  play  best(i). 
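
The four execution steps can be sketched as a single loop. In this Python illustration, `classify` is a hypothetical stand-in for the trained SVM models (it is not the LIBSVM API), `lookup` is a table keyed by (defensive play, current offensive play), and the returned value represents the play named in the change order. Time steps are assumed to be indexed from zero.

```python
def run_offense(observations, classify, lookup, current_play,
                decision_step=3):
    """Collect traces, recognize the defense at the decision step,
    and return the offensive play to change to.

    observations: iterable of per-time-step snapshots of player positions.
    classify: callable taking the collected trace and returning a
              defensive play label (hypothetical classifier interface).
    lookup: dict mapping (defensive play, current play) to best(i).
    """
    trace = []
    for t, snapshot in enumerate(observations):
        trace.append(snapshot)            # step 1: collect movement traces
        if t == decision_step:
            defense = classify(trace)     # step 2: identify defensive play
            # Steps 3 and 4: look up best(i); fall back to the current
            # play when the table has no entry (assumed behavior).
            return lookup.get((defense, current_play), current_play)
    return current_play                   # play ended before the decision

# Hypothetical usage with a stub classifier and a one-entry table.
stub_classifier = lambda trace: "31-1"
stub_lookup = {("31-1", 0): 2}
print(run_offense(range(10), stub_classifier, stub_lookup, current_play=0))
```

Returning the chosen play rather than issuing a command directly keeps the sketch independent of the simulator interface, which is not described here.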


8  Empirical  Evaluation

The algorithm was tested using the RUSH 2008 simulator for
ten plays on each possible play configuration in three separate
trials. We compared our play switch model (using the yardage
threshold $\epsilon = 1.95$ as determined by the quadratic fit) to the
baseline Rush offense and to a greedy play switch strategy
($\epsilon = MAX$) based solely on the yardage.

Overall, the average performance of the offense went from
2.82 yards per play to 3.65 yards per play ($\epsilon = 1.95$), an
overall increase of 29% ±1.5% based on sampling of three
sets of ten trials. An analysis of each of the formation
combinations (Figure 4) shows that the yardage gain varies from as
much as 100% to as little as 0.1%. Overall, performance is
consistently better for every configuration tested. In all cases,
the new average yardage is over 2.3 yards per play, with none
of the weak plays seen in the baseline, such as Power vs. 23
(1.4 average yards per play) and Power vs. 2222 (1.3 average
yards per play). Results with $\epsilon = MAX$ clearly show that
simply changing to the play with the greatest yardage generally
results in poor performance from the offense.

Figure 4: Comparison of play switch selection methods. Our
play switch method (shown in red) outperforms both baseline
Rush offense (blue) and a greedy play switch metric (green).

Power vs. 23 is dramatically boosted from about 1.5 yards
to about 3 yards per play, doubling yards gained. Other
combinations, such as Split vs. 23 and Pro vs. 32, already
scored good yardage and improved less dramatically, at
about 0.2 to 0.4 yards more than the gains in the baseline sample.
In Figure 4 we see that all the Split configurations do quite well;
this is unsurprising given our calculations of the best response.
However, when the threshold is not in use and the plays are
allowed to change regardless of current yardage, the results
are drastically reduced. The reason seems to be associated with
player miscoordinations accidentally induced by the play
switch; by maximizing the play similarity simultaneously, the
possibility of miscoordinations is reduced. Figure 5 shows
yardage gained by the best play switch strategy over the
Rush baseline offense. Power vs. 23 experiences the greatest
enhancement and Split vs. 31 the least. It is interesting to
note that Split formations performed best in the baseline and
improved the least, while the Power formations performed the
worst in the baseline and improved the most. This indicates
that the gain expected from the algorithm is inversely related
to baseline performance.

Figure 5: The play-yardage gain over baseline Rush offense
yielded by our play switch strategy.
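
The percentage improvements quoted in this section follow from a simple relative-gain calculation, sketched below in Python using the yardage figures reported in the text. The function name is hypothetical.

```python
def relative_gain(baseline, improved):
    """Fractional improvement of `improved` over `baseline`."""
    return (improved - baseline) / baseline

# Overall average: 2.82 -> 3.65 yards per play.
print(f"{relative_gain(2.82, 3.65):.1%}")   # prints 29.4%

# Power vs. 23: about 1.5 -> about 3 yards per play.
print(f"{relative_gain(1.5, 3.0):.0%}")     # prints 100%
```

The first figure agrees with the reported overall increase of 29%, and the second with the largest per-configuration gain of 100%.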


9  Discussion 

There are a number of ways the offensive team can utilize
information provided by an early play recognition system,
other than through our play switch mechanism. One possibility
would be to use the play information to predict the future
positions of the defensive players and move towards those
coordinates instead of the ones specified in the initial play.

Another question is how to utilize less accurate play recognition
to improve offensive play. Since Rush is a simulated
game, it is possible for the intent recognizer to perfectly observe
the environment, which may be an unrealistic assumption
in the real world. To address this, we replicated our experiments
under conditions where both training and test data
were corrupted by observation noise introduced outside the
simulation environment. Classification accuracy at t = 3
with the noisy data ranged from 60% to 90%. In this case,
it would be possible to have the offensive team consider the
information contained in the classification confusion matrix
when evaluating play switches by maintaining multiple hypotheses
about the defensive play. The agent could use risk-minimization
metrics when considering possible counter-responses
to multiple hypotheses, choosing to accept smaller
but less risky yardage gains.
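
One way this idea could be realized is sketched below in Python. This is not part of the system described in the paper; it is a hypothetical illustration in which the confusion matrix is turned into a distribution over the true defensive play given the classifier's output, and the offense then chooses either the play with the highest expected yardage or, in a risk-averse mode, the play with the best worst-case yardage over the plausible hypotheses. All names, the `min_prob` cutoff, and the numbers are assumptions.

```python
def posterior_given_prediction(confusion, predicted):
    """P(true play | predicted play) from a confusion matrix of counts.

    confusion[true][pred] is how often `true` was classified as `pred`.
    Assumes the predicted column contains at least one nonzero count.
    """
    column = [row[predicted] for row in confusion]
    total = sum(column)
    return [count / total for count in column]

def choose_play(confusion, predicted, yards, risk_averse=False,
                min_prob=0.05):
    """Choose an offensive play under uncertainty about the defense.

    yards[d][j] is the average yardage of offensive play j against
    true defensive play d.
    """
    post = posterior_given_prediction(confusion, predicted)

    def expected(j):
        return sum(p * yards[d][j] for d, p in enumerate(post))

    def worst_case(j):
        # Consider only hypotheses that are reasonably likely.
        return min(yards[d][j] for d, p in enumerate(post) if p >= min_prob)

    score = worst_case if risk_averse else expected
    return max(range(len(yards[0])), key=score)

# Hypothetical two-play example.
confusion = [[6, 4],
             [1, 9]]
yards = [[5.0, 3.0],
         [0.0, 2.5]]
print(choose_play(confusion, predicted=0, yards=yards))
print(choose_play(confusion, predicted=0, yards=yards, risk_averse=True))
```

In this toy case the expected-yardage rule picks play 0, while the risk-averse rule picks play 1, giving up some expected yardage to avoid the zero-yard outcome if the classifier is wrong.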

Also, it is possible that poorer play recognition might not
make a difference in all cases if the play can be countered in
similar ways. For instance, in formation 23, our SVM classifier
persistently misclassifies play variant 1 as 2, resulting in
poor accuracy. However, the two variants differ only subtly in
player action choice, and often not at all in execution trace. In this
case, even though the classifier performs poorly, the resulting
offense would be unlikely to suffer.

One question is whether the assumption of playbook
knowledge, common to any plan recognition system, is
reasonable for this domain, since real teams try hard to keep
their playbooks secret. A play recognition system would
have to compensate for this in the same way that real-life
coaches do: by watching and analyzing footage from previous
games or from earlier in the same game. We believe
investigating unsupervised and semi-supervised approaches
to this problem is a fruitful area for future research; we have
demonstrated the applicability of an unsupervised clustering
approach for analyzing Rush football plays to improving
reinforcement learning [Molineaux et al., 2009].

10  Conclusion 

In this paper, we present an approach for early, accurate
recognition of defensive plays in the Rush 2008 football
simulator. We demonstrate that a multi-class SVM classifier
trained on spatio-temporal game traces can enable the
offense to correctly anticipate the defense's play by the third
time step. Using this information about the defense's intent,
our system evaluates the competitive advantage of executing
a play switch based on the potential of other plays
to improve the yardage gained and the similarity of the
candidate plays to the current play. Our play switch selection
mechanism outperforms both the built-in Rush offense and a
greedy yardage-based switching strategy, increasing yardage
while avoiding the miscoordinations accidentally induced by
the greedy strategy during the transition from the old play to
the new one. In future work, we plan to look at adapting our
current method to be more robust against poor play recognition.